CHAPTER 1 Biostatistics 101 13

A Matter of Life and Death: Working with Survival Data

Sooner or later, everyone dies, and in biological research, it becomes especially

important to characterize that sooner-or-later part as accurately as possible using

survival analysis techniques. But characterizing survival can get tricky. It’s possible

to say that patients may live an average of 5.3 years after they are diagnosed

with a particular disease. But what is the exact survival experience? Imagine you

do a study with patients who have this disease. You may ask: Do all patients tend

to live around five or six years, or do half the patients die within the first few

months, and the other half survive ten years or more? And what if some patients

live longer than the observational period of your study? How do you include them

in your analysis? And what about participants who stopped returning calls from

your study staff? You do not know if these dropouts went on to live or die. How do

you include their data in your analysis?

The need to study survival with data like these led to the development of survival

analysis techniques. But survival analysis is not only intended to study the

outcome of death. You can use survival analysis to study the time to the first

occurrence of non-death events as well, like remission or recurrence of cancer,

the diagnosis of a particular condition, or the resolution of a particular condition.

Survival analysis techniques are presented in Part 6.
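
To make the questions about incomplete follow-up concrete, here is a minimal sketch of one standard approach, the Kaplan-Meier (product-limit) estimate. It is only an illustration under stated assumptions, not the method as presented in Part 6: the function name kaplan_meier, the follow-up times, and the event flags are all hypothetical. Participants who outlive the study or drop out are marked as censored. They count as "at risk" up to their last known follow-up time, but they never count as events.

```python
from collections import Counter

def kaplan_meier(times, observed):
    """Return [(time, survival_probability)] at each time an event occurred.

    times    -- follow-up time for each participant (hypothetical units: years)
    observed -- True if the event (for example, death) was seen at that time,
                False if the participant was censored (study ended or dropped out)
    """
    events = Counter(t for t, seen in zip(times, observed) if seen)
    exits = Counter(times)        # everyone who leaves the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            # Multiply by the fraction of at-risk participants who survive time t
            survival *= 1 - events[t] / at_risk
            curve.append((t, survival))
        # Events and censored participants both leave the risk set after time t
        at_risk -= exits[t]
    return curve

# Made-up data for eight participants (illustration only)
times    = [0.5, 1.0, 2.0, 2.0, 3.5, 5.0, 5.0, 6.0]
observed = [True, False, True, True, False, True, False, False]

curve = kaplan_meier(times, observed)
for t, s in curve:
    print(f"Estimated probability of surviving past {t} years: {s:.3f}")
```

In this sketch, the censored participants still contribute information: each one keeps the at-risk count higher until the time of last contact, which is how dropouts and long-term survivors can be included without guessing their outcomes.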

Getting to Know Statistical Distributions

Statistics books always contain tables, so why should this one be any different?

Back in the not-so-good old days, when analysts had to do statistical calculations

by hand, they needed to use tables of the common statistical distributions to

complete the calculation of the significance test. They needed tables for the normal

distribution, Student t, chi-square, Fisher F, and others. Now, software does all

this for you, including calculating exact p values, so these printed tables aren’t

necessary anymore.

But you should still be familiar with the common statistical distributions that may

describe the fluctuations in your data, or that may be referenced in the course of

performing a statistical calculation. Chapter 24 contains a list of commonly used

distribution functions, with explanations of where you can expect to encounter

those distributions and what they look like. We also include a description of some

of their properties and how they’re related to other distributions. Some of them

are accompanied by a small table of critical values, corresponding to statistical

significance at α = 0.05.
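
As a small illustration of software standing in for a printed table, the sketch below uses Python's standard-library NormalDist class to look up the two-sided critical value of the standard normal distribution at α = 0.05 and to compute an exact p value. The observed statistic z_obs is a made-up number used only for demonstration.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1

# Two-sided critical value: the cutoff that leaves alpha/2 in each tail
z_crit = z.inv_cdf(1 - alpha / 2)

# Exact two-sided p value for a hypothetical observed z statistic
z_obs = 2.3
p_value = 2 * (1 - z.cdf(abs(z_obs)))

print(f"Critical z at alpha = {alpha}: {z_crit:.3f}")
print(f"Two-sided p value for z = {z_obs}: {p_value:.4f}")
```

The critical value comes out at about 1.96, the familiar entry from a printed normal table. The standard library covers only the normal distribution; for the Student t, chi-square, and Fisher F distributions you would need a statistics package, such as SciPy's scipy.stats module, assuming it is installed.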